Abstract

We report on the results of a search for standard model Higgs bosons produced in association
with W bosons from $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV. The search uses a data sample
corresponding to approximately 1 fb$^{-1}$ of integrated luminosity. Events consistent with the
$W \to \ell\nu$ and $H \to b\bar{b}$ signature are selected by triggering on a high-$p_T$ electron
or muon candidate and tagging one or two of the jet candidates as having originated from b quarks.
A neural network filter rejects a fraction of tagged charm and light-flavor jets, increasing the
b-jet purity in the sample and thereby reducing the background to Higgs boson production. We
observe no excess $\ell\nu b\bar{b}$ production beyond the background expectation, and we set 95%
confidence level upper limits on the production cross section times branching fraction
$\sigma(p\bar{p} \to WH) \cdot \mathrm{Br}(H \to b\bar{b})$ ranging from 3.9 to 1.3 pb, for specific
Higgs boson mass hypotheses in the range 110 to 150 GeV/$c^2$, respectively.

PACS numbers: 13.85.Rm, 14.80.Bn 
The standard model of elementary particle physics (SM) provides for electroweak gauge
symmetry breaking via the Higgs mechanism, and the model predicts a single physical
remnant of the added Higgs field. This remnant, the Higgs boson H, has yet to be observed
experimentally. Results from direct searches at the LEP collider exclude mass values less
than 114.4 GeV/$c^2$ at 95% confidence level [2], and global fits to precision electroweak
data exclude masses greater than 144 GeV/$c^2$ at 95% confidence level [3]. (Some models
beyond the SM predict Higgs bosons whose masses are not constrained by these limits.) For
Higgs boson masses just above the range excluded by LEP, the decay to bottom quarks $b\bar{b}$
dominates. Even though gluon fusion $gg \to H \to b\bar{b}$ has the largest cross section among
Higgs production processes in $p\bar{p}$ collisions [4], the $b\bar{b}$ data sample is dominated by
non-resonant multi-jet background. Consequently, we search for WH production, requiring a
leptonic W boson decay to suppress the background. In this Letter we report results of a
search for low-mass SM Higgs bosons produced in association with W bosons and decaying
to $b\bar{b}$ pairs. The resulting $\ell\nu b\bar{b}$ final state is identified by selecting events
with exactly one high-energy electron or muon candidate, large missing transverse energy, and one
or two jet candidates having a secondary vertex characteristic of heavy quark decay.

Recent searches at CDF and D0 were limited not only by smaller data samples,
but also by contamination from jets associated with charm or light quarks which are falsely
tagged as b jets. The search described in this Letter employs for the first time a neural
network filter to reject such events, thereby improving the purity of the selected event
sample. The data sample of $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV used here corresponds to
$0.955 \pm 0.057$ fb$^{-1}$ of integrated luminosity, nearly three times the sample used in previous
searches.

The CDF II detector is a general-purpose detector located at the Tevatron $p\bar{p}$ collider at
Fermilab [7, 8]. It consists of a cylindrical magnetic spectrometer surrounded by sampling
calorimeters used to measure energies of electromagnetic showers and jets. Charged particle
tracking is performed with microstrip silicon detectors surrounded by a large cylindrical
multilayer drift chamber, both immersed in a solenoidal magnetic field. Jets are identified
as a collection of hadronic and electromagnetic calorimeter towers, which are clustered using
an iterative cone algorithm with a cone of $\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} = 0.4$
units in the azimuth-pseudorapidity space [9, 10]. Planar drift chambers used for muon detection
surround the calorimeters at least five interaction lengths from the interaction region.
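
As an illustration of the cone definition above, the following minimal Python sketch computes $\Delta R$ between a cone axis and calorimeter towers and keeps the towers inside a cone of radius 0.4. It is not the CDF clustering code: the function names and the (eta, phi) tuple layout are assumptions made for the example, and the iterative recomputation of the cone axis is omitted.

```python
import math

def delta_phi(phi1, phi2):
    """Signed azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in azimuth-pseudorapidity space: sqrt(dphi^2 + deta^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def towers_in_cone(axis, towers, radius=0.4):
    """Return the towers whose (eta, phi) lies within `radius` of the axis.

    `axis` and each tower are hypothetical (eta, phi) tuples.
    """
    eta0, phi0 = axis
    return [t for t in towers if delta_r(eta0, phi0, t[0], t[1]) < radius]
```

The azimuthal difference is wrapped so that towers on either side of the $\phi = 0$ boundary are treated as neighbors.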
Events are collected using high-$p_T$ electron or muon triggers with a three-level selection
filter. The first- and second-level criteria ensure that purely electromagnetic calorimeter
clusters exist or that track stubs in the muon chambers align with drift chamber tracks
having transverse momentum of at least 8 GeV/$c$. The third-level trigger ensures that a fully
reconstructed track with $p_T$ of at least 18 GeV/$c$ points to the electromagnetic cluster or muon
stub.

Events compatible with the $\ell\nu b\bar{b}$ final state are selected by requiring exactly one
electron or muon candidate and missing transverse energy greater than 20 GeV, after jets are
corrected for detector imperfections and non-linear calorimeter response [9]. The electron or muon
must be within the central part of the detector, in the pseudorapidity regions $|\eta| < 1.1$ or
$|\eta| < 1.0$, respectively, and must have transverse energy greater than 20 GeV. The lepton
must be isolated from the rest of the event by a cone of radius $\Delta R = 0.4$ containing no
more than 10% of the lepton energy (excluding the lepton itself). It must also be no more
than 5 cm in z away from the primary event vertex, which is defined by fitting a subset
of charged particle tracks in the event to a single vertex. To suppress background from Z
boson and diboson production, we reject events with more than one isolated lepton, as well as
events in which the lepton and another high-energy track of opposite sign form an invariant
mass between 76 and 106 GeV/$c^2$. The requirements for the lepton transverse energy and
missing transverse energy reject multi-jet background while maintaining efficiency for the
WH signal. Jets used in the analysis must fall within the acceptance of the silicon detector
($|\eta| < 2.0$) for reliable b-tagging, and they must have transverse energy greater than 15 GeV.
Even though the W+2 jet final state is the target sample for this search, other samples with
W+1 or W+3, 4 jets are useful for cross-checks of the background estimates with similar
topologies.
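
The Z veto described above can be sketched in a few lines of Python. This is an illustrative sketch only: the dictionary keys (`pt`, `eta`, `phi`, `charge`) are hypothetical names, and both particles are treated as massless, which is an assumption of the example rather than a statement about the analysis code.

```python
import math

def invariant_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass (GeV/c^2) of two particles treated as massless.

    Uses m^2 = 2 pT1 pT2 (cosh(d_eta) - cos(d_phi)).
    """
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))

def fails_z_veto(lepton, track, low=76.0, high=106.0):
    """True if an opposite-charge lepton-track pair falls in the Z mass window."""
    opposite = lepton["charge"] * track["charge"] < 0
    mass = invariant_mass(lepton["pt"], lepton["eta"], lepton["phi"],
                          track["pt"], track["eta"], track["phi"])
    return opposite and low < mass < high
```

An event would be rejected if `fails_z_veto` returns True for the selected lepton paired with any other high-energy track.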

A B hadron, with relatively long lifetime and large mass, can decay to charged particles
whose tracks have large impact parameter, the distance of closest approach to the interaction
point in the transverse plane. Such tracks are fit to a secondary vertex, and the decay
length of the B hadron is defined as the distance between this vertex and the primary
vertex. Specifically, we apply the SECVTX secondary vertex finding algorithm [11] to each
jet in the event, using tracks within the $\Delta R = 0.4$ cone centered on the jet axis. Three
tracks with impact parameter significances greater than 2.0 are fit to a decay vertex. If
this first pass fails, a second pass is attempted with two tracks having impact parameter
significances greater than 3.0. To limit the number of secondary vertices stemming from
material interactions, we reject vertices at radii greater than 2.5 cm, as well as any vertices
reconstructed within the material of the beampipe and the innermost silicon layer
(1.2 cm < r < 1.5 cm). Jets are b-tagged if the magnitude of the significance of the transverse
decay length is greater than 7.5. Jets with a negative decay length have a reconstructed flight
direction opposite the jet direction. This can happen when tracks coming from the primary
vertex have significantly mismeasured impact parameters.
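
The vertex acceptance and tagging thresholds quoted above can be collected into a short Python sketch. It is a simplified illustration, not the SECVTX implementation: the function names and the "positive"/"negative" labels are assumptions of the example, and the track selection and vertex fit themselves are not modeled.

```python
def vertex_is_accepted(radius_cm):
    """Reject vertices beyond 2.5 cm or inside the 1.2-1.5 cm material region."""
    in_material = 1.2 < radius_cm < 1.5
    return radius_cm <= 2.5 and not in_material

def classify_tag(lxy_cm, sigma_lxy_cm, radius_cm, threshold=7.5):
    """Classify a jet from its signed transverse decay length.

    Returns "positive" or "negative" when |Lxy / sigma| exceeds the
    threshold and the vertex is accepted, otherwise None. The sign of
    Lxy is taken relative to the jet direction.
    """
    if not vertex_is_accepted(radius_cm):
        return None
    significance = lxy_cm / sigma_lxy_cm
    if abs(significance) <= threshold:
        return None
    return "positive" if significance > 0 else "negative"
```

Keeping the sign of the decay length allows jets with a negative decay length to be counted separately from those with a positive one.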

In addition to the secondary vertex finding algorithm, a neural network (NN) filter has
been trained with the JETNET program [12] to reject tagged jets originating from charm
or light (u, d, s) quarks. When a b quark hadronizes, particles originating at the B hadron
vertex tend to carry a large fraction of the jet energy. Such jets also tend to have a higher
vertex mass (the invariant mass of all tracks in the secondary vertex) and longer decay lengths
than jets from light quarks. The NN filter uses these jet characteristics to discriminate
between b jets and c or light-flavor jets. It is composed of two networks in series, one to
separate b jets from light quark jets, and the second to separate b jets from c jets. Both
networks have the same set of 16 inputs: the number of tracks in the secondary vertex, the
$\chi^2$ value of the vertex fit, the transverse decay length and its significance, the vertex mass
calculated by assuming the charged pion mass for all particles, the proper time assuming
the vertex mass, the fraction of the jet $p_T$ carried by tracks in the vertex, the vertex pass
number, the number of tracks with significant impact parameter, the reconstructed mass of
the SECVTX pass 1 and pass 2 tracks, the numbers of pass 1 and pass 2 tracks, the fraction
of the jet $p_T$ carried by the pass 1 and pass 2 tracks, and finally the probability of a selected
ensemble of tracks to have originated at the primary vertex [13]. The selection cuts on the
NN output are chosen to give 90% efficiency for true b jets identified with the secondary
vertexing algorithm. The corresponding rejection rates are (65 ± 5)% for light-flavor jets
and (50 ± 5)% for charm jets, as measured using simulated events and verified with multijet
data.

Our search criteria select events with exactly one high-energy charged lepton, missing
transverse energy, and two jets. The search sensitivity is maximized by defining two distinct
subsamples based on the following b-tagging requirements: single-tagged events with exactly
one b-tagged jet which passes the NN filter, and double-tagged events with two b-tagged
jets. Because events with charm and light-flavor jets are unlikely to be double-tagged, the
extra NN filter is not applied to double-tagged events. The selected event sample includes
contributions from other SM processes. The largest background rates are due to W+jets
production, $t\bar{t}$ production, and non-W multijet production, with small contributions from
electroweak boson production (WW or WZ).

The dominant background contribution comes from W+jets production, either with jets
from b or c quarks or with jets mistagged by the b-tagging algorithm. The effect of true
W+heavy-flavor production is estimated from a combination of data and simulation. We
use the ALPGEN Monte Carlo program [14] to calculate the rate of $Wb\bar{b}$, $Wc\bar{c}$, and $Wc$
production relative to inclusive W+jets production. Then this relative rate is applied to the
observed W+jets sample, after non-W and $t\bar{t}$ contributions have been subtracted. Finally,
we apply a b-tagging efficiency and NN filter rate derived using the ALPGEN event samples.
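
The arithmetic of this estimate can be written out as a short Python sketch. All names and the dictionary layout are hypothetical, and any numbers passed in are placeholders chosen by the user, not values from this analysis; the sketch only encodes the three steps stated above (subtract non-W and top-pair events, apply the simulated heavy-flavor fraction, apply tagging and NN filter rates).

```python
def w_heavy_flavor_yield(n_observed, n_non_w, n_ttbar, flavors):
    """Estimate tagged W+heavy-flavor yields per flavor category.

    n_observed : events in the W+jets sample before tagging
    n_non_w, n_ttbar : estimated non-W and top-pair contributions to subtract
    flavors : dict mapping a category name to a dict with hypothetical keys
              "fraction" (rate relative to inclusive W+jets, from simulation),
              "tag_eff" (b-tagging efficiency), and "nn_rate" (NN filter pass rate)
    """
    n_w_jets = n_observed - n_non_w - n_ttbar
    return {
        name: n_w_jets * f["fraction"] * f["tag_eff"] * f["nn_rate"]
        for name, f in flavors.items()
    }
```

Summing the returned values gives the total W+heavy-flavor prediction for one jet multiplicity under these assumptions.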

Events from $t\bar{t}$ production followed by leptonic W decay typically have two b jets from t
decay, significant missing transverse energy, and one or two high-energy leptons with two or
zero additional jets, depending on whether one or both W bosons from the top quarks decay
leptonically. Our selected sample includes contributions from pairs of top quarks which both
decay leptonically, but for which the second lepton is not reconstructed. It also includes
contributions from pairs of top quarks with one leptonic decay which are selected as two-jet
events because two out of four jets do not satisfy the selection criteria. The contribution
from $t\bar{t}$ production to the $\ell\nu b\bar{b}$ final state is estimated using simulated PYTHIA
events [15]. It is normalized to the NLO cross section $6.7^{+0.7}_{-0.9}$ pb calculated for
$m_t = 175$ GeV/$c^2$ [16]. The small contribution from production of single top quarks is estimated
using MADEVENT [17] and PYTHIA normalized to the NLO cross section [18].

Multijet events may have high-energy identified leptons or missing transverse energies,
both mimicking the signature of W decay. These may be from semileptonic heavy-flavor
decay or from false reconstructions. The identified leptons from such events are rarely
isolated in energy, as required by our event selection, and seldom yield large missing transverse
energy. We therefore calculate the number of non-W events in our selected sample by
extrapolating from sideband regions (defined in the space of lepton energy isolation and missing
transverse energy) into the signal region [8].

Contributions from events with falsely tagged light-flavor jets are estimated by measuring
a false tag (or mistag) rate in generic jet data. To first order the negative tag rate is a good
approximation of the mistag rate because light-flavor jets, whose tracks are prompt, have
Gaussian reconstructed decay length distributions symmetric about zero [11]. The mistag
rate is further modified by the NN filter efficiency. The resulting overall mistag rate is applied
to the W+jets sample to yield the number of mistagged events present in the sample.

Small contributions from electroweak backgrounds (WW, ZZ, WZ, and $Z \to \tau\tau$) are
estimated using the most recent theoretical cross section calculations [19], with acceptances
calculated using fully simulated events from the PYTHIA Monte Carlo program.

The dominant uncertainty in the W+heavy-flavor background is the calibration factor for
simulation derived from multijet data [8]. Different simulation inputs give different factors,
and we find a 35% relative error on the background from heavy flavor. The background
from false tags has major uncertainties on the rate correction due to particle interactions
in detector material and on the NN rejection factor. Both are 15% relative errors.
Cross-checks of sideband data yield a 17% relative uncertainty on the non-W multijet estimate.
The electroweak background estimates for diboson and single top are subject to uncertainties
in the b-tagging efficiency and the cross section predictions.

We use the large b-tagged sample of W+1 jet events to derive a data-based scaling factor
of 1.2 ± 0.2, which corrects a residual mismatch between the heavy-flavor fraction correction
factor in multijet data and the W+jets sample. This single factor is applied to the
W+heavy-flavor background calibration for all jet multiplicities, and it improves the agreement
for the sideband multiplicities of W+1, 3, 4 jets. A summary of the estimated background
contributions to the lepton + jets sample is shown in Table I along with the results from
the data sample.

The signal process in which a Higgs boson decays to $b\bar{b}$ is expected to show a resonant
peak in the dijet mass spectrum. Figures 1 and 2 show the dijet mass spectra in the single-
and double-tagged 2-jet samples for the estimated background as well as for the observed
events. A 115 GeV/$c^2$ Higgs boson signal at ten times the SM rate is shown for comparison.
There is no significant excess observed in the dijet mass spectrum. The largest discrepancy,
for masses near 100 GeV/$c^2$, is less than one standard deviation defined by the uncertainty
on the background estimate.

The acceptance for $WH \to \ell\nu b\bar{b}$, including leptonic $\tau$ decays, is calculated from
samples generated with the PYTHIA Monte Carlo program using Higgs boson mass values between
110 and 150 GeV/$c^2$. Additional efficiency factors include the trigger and identification
efficiencies measured separately for the lepton types, and a b-tagging efficiency
data-to-simulation scale factor. The acceptances for the single NN tag and double tag selections
are (1.3 ± 0.1)% and (0.4 ± 0.1)%, including the W branching ratio to leptons, for a
mass hypothesis of 115 GeV/$c^2$. The dominant systematic uncertainty on the acceptance is
the b-tagging scale factor uncertainty, which is a 5.3% relative error for the single-tagged
selection and a 16% relative error for the double-tagged selection. Variations in the final state
radiation model introduce relative uncertainties of 3.2% and 8.6% into the NN single-tagged
acceptance and double-tagged acceptance, respectively. Additional sources of systematic
error include the jet energy scale, the lepton identification efficiency, and the initial state
radiation model [11, 8].

FIG. 1: Reconstructed dijet mass distributions for W+2-jet events with a single b-tag passing the
NN filter. The histogram binning is the same used in the binned likelihood calculation.

FIG. 2: Reconstructed dijet mass distributions for W+2-jet events with two b-tagged jets. The
histogram binning is the same used in the binned likelihood calculation.

Limits on the number of Higgs boson events, interpreted as the production rate times 
the branching fraction, are derived using a binned likelihood technique assuming Poisson 
statistics. The sample is divided into bins of reconstructed dijet mass because the Higgs 
boson production signal is expected to show a mass resonance. By counting the number of 
events in each reconstructed mass bin, we discriminate more effectively between the peaked 
signal and the predicted background shape. Likelihoods are calculated separately for the 
single-tagged and double-tagged selections. The best exclusion comes from a combination 
of the single-tagged selection with the NN filter applied and the double-tagged selection. 
To calculate the production limits, a Bayesian interval is constructed from the cumulative
likelihood distributions and a prior probability density function uniform in the number of
Higgs boson signal events s. The 95% confidence level upper limit is defined to be the value
$s_{up}$ for which $\int_0^{s_{up}} L(s)\,ds \,/ \int_0^{\infty} L(s)\,ds = 0.95$. The number of signal
events is then converted to a Higgs boson production cross section times branching fraction
$\sigma(p\bar{p} \to WH) \cdot \mathrm{Br}(H \to b\bar{b})$.
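
The limit definition above can be illustrated with a minimal numerical sketch in Python. It assumes a binned Poisson likelihood in which bin i has expectation b_i + s f_i, with f_i a normalized signal shape; the function names, the finite upper integration bound `s_max` (standing in for infinity), and the grid size are choices made for the example, and systematic uncertainties are ignored.

```python
import math

def log_likelihood(s, observed, background, signal_shape):
    """Log of the product of Poisson terms, one per dijet mass bin."""
    total = 0.0
    for n, b, f in zip(observed, background, signal_shape):
        mu = b + s * f  # expected events in this bin; requires mu > 0
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total

def bayesian_upper_limit(observed, background, signal_shape,
                         cl=0.95, s_max=100.0, steps=20000):
    """Flat-prior Bayesian upper limit on the number of signal events s.

    Integrates L(s) on [0, s_max] with the midpoint rule and returns the
    value of s at which the cumulative integral reaches `cl` of the total.
    """
    ds = s_max / steps
    grid = [(i + 0.5) * ds for i in range(steps)]
    logs = [log_likelihood(s, observed, background, signal_shape) for s in grid]
    peak = max(logs)  # subtract the peak to avoid numerical underflow
    weights = [math.exp(v - peak) for v in logs]
    target = cl * sum(weights)
    running = 0.0
    for s, w in zip(grid, weights):
        running += w
        if running >= target:
            return s + 0.5 * ds  # upper edge of the bin where the target is crossed
    return s_max
```

For a single bin with zero observed events the likelihood is proportional to exp(-s), so the sketch should return a value close to -ln(0.05), about 3.0 events, independent of the background.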

The observed 95% confidence level upper limits on the cross section times branching
fraction range from 3.9 to 1.3 pb, for Higgs boson mass hypotheses from 110 to 150 GeV/$c^2$,
respectively. Figure 3 summarizes the observed limits as well as the expected limits as a
function of the Higgs boson mass hypothesis. A set of background-only pseudoexperiments
is generated for each mass; the median limit value in this set defines the expected limit
in the absence of signal, and the spread of the pseudoexperiments defines the $1\sigma$ band.
The observed limit in the low mass region is roughly 2 standard deviations higher than the
expected limit.
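
The pseudoexperiment procedure can be sketched for a simplified single-bin counting experiment. This is an illustration under stated assumptions, not the analysis code: the single-bin simplification, the function names, the use of the 16th and 84th percentiles as the band, and the Knuth Poisson sampler are all choices made for the example.

```python
import math
import random
import statistics

def poisson_sample(rng, mean):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def counting_limit(n_obs, background, cl=0.95, s_max=50.0, steps=5000):
    """Flat-prior Bayesian upper limit on s for a single Poisson count."""
    ds = s_max / steps
    grid = [(i + 0.5) * ds for i in range(steps)]
    logs = [n_obs * math.log(background + s) - (background + s) for s in grid]
    peak = max(logs)
    weights = [math.exp(v - peak) for v in logs]
    target = cl * sum(weights)
    running = 0.0
    for s, w in zip(grid, weights):
        running += w
        if running >= target:
            return s + 0.5 * ds
    return s_max

def expected_limit(background, n_pseudo=2000, seed=1):
    """Return (low, median, high) limits from background-only pseudoexperiments."""
    rng = random.Random(seed)
    cache = {}  # the limit depends only on the drawn count, so reuse it
    limits = []
    for _ in range(n_pseudo):
        n = poisson_sample(rng, background)
        if n not in cache:
            cache[n] = counting_limit(n, background)
        limits.append(cache[n])
    limits.sort()
    low = limits[int(0.16 * (n_pseudo - 1))]
    high = limits[int(0.84 * (n_pseudo - 1))]
    return low, statistics.median(limits), high
```

Comparing an observed limit with the returned median and band shows whether the data fluctuate above or below the background-only expectation.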
FIG. 3: 95% confidence level upper limit on Higgs boson production cross section times branching
fraction as a function of Higgs boson mass hypothesis. The expected limits from background-only
pseudoexperiments are shown in addition to the observed results from this search and previous
CDF and D0 searches.

In this Higgs boson search, we have employed a novel neural network b-tagging filter on
a dataset nearly three times the size of previous searches. The resulting exclusion significantly
improves the limits on the allowed production rate for Higgs bosons in $p\bar{p}$ collisions. Even
though the largest improvement by far comes from the larger dataset, separating the single-
and double-tag samples results in a 20% improvement beyond the previous analysis, and
rejecting charm and light-flavor jets with the NN gains another 5% in sensitivity.